What is Biosurveillance?


• Homeland Security Presidential Directive HSPD-21 (October 18, 2007) [1]:

- “The term ‘biosurveillance’ means the process of active data-gathering ... of biosphere data ... in order to achieve early warning of health threats, early detection of health events, and overall situational awareness of disease activity.”

- “The Secretary of Health and Human Services shall establish an operational national epidemiologic surveillance system for human health...”

• Epidemiologic surveillance:

- “...surveillance using health-related data that precede diagnosis and signal a sufficient probability of a case or an outbreak to warrant further public health response.” [2]


[1] www.whitehouse.gov/news/releases/2007/10/20071018-10.html
[2] CDC (www.cdc.gov/epo/dphsi/syndromic.htm, accessed 5/29/07)

An Existing System: BioSense

[Figure: US map of BioSense coverage showing DoD facilities (366) and VA facilities (886) over US population by county (low, medium, high). Source: BioSense, Division of Emergency Preparedness and Response, Centers for Disease Control and Prevention, February 27, 2007.]

Think of It Like a Large System of Sensors

• Issue: False alarms a serious problem

- “...most health monitors... learned to ignore alarms triggered by their system. This is due to the excessive false alarm rate that is typical of most systems - there is nearly an alarm every day!” [1]


[1] https://wiki.cirg.washington.edu/pub/bin/view/Isds/SurveillanceSystemsInPractice

The Problem in Summary

• Goal: Early detection of disease outbreak and/or bioterrorism

• Issue: Currently detection thresholds set naively

- Equally for all sensors

- Ignores differential probability of attack

• Result:

- High false alarm rates

- Loss of credibility

Formal Description of the System

• Let $X_{it}$ denote the output from sensor $i$ at time $t$

• Each sensor / location has a probability of outbreak / attack: $p_i$, with $\sum_i p_i = 1$

• If no “event of interest” anywhere in the network, $X_{it} \sim F_0$ for all $i$ and $t$

• If an event of interest occurs at time $\tau$, $X_{i\tau} \sim F_1$ for exactly one $i$

• A signal is generated at time $\tau^*$ when $X_{i\tau^*} > h_i$ for one or more $i$

Threshold Detection

[Figure: two densities plotted against sensor output, the distribution of background disease incidence ($f_0$) and the distribution of background incidence and attack/outbreak ($f_1$), with the threshold $h$ marked on the horizontal axis.]

Probability of a true signal:

$$\int_{x=h}^{\infty} f_1(x)\,dx = 1 - F_1(h)$$

Probability of a false signal:

$$\int_{x=h}^{\infty} f_0(x)\,dx = 1 - F_0(h)$$
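
The two tail probabilities can be evaluated numerically. The following is a minimal sketch, assuming the normal forms adopted elsewhere in these slides ($F_0 = N(0,1)$ and $F_1 = N(2,1)$); the function name `signal_probabilities` is illustrative and not from the source.

```python
from statistics import NormalDist

def signal_probabilities(h, f0=NormalDist(0, 1), f1=NormalDist(2, 1)):
    """Return (Pr(true signal), Pr(false signal)) for a threshold h.

    f0 is the assumed background distribution, f1 the assumed
    background-plus-attack distribution.
    """
    p_true = 1.0 - f1.cdf(h)    # 1 - F_1(h)
    p_false = 1.0 - f0.cdf(h)   # 1 - F_0(h)
    return p_true, p_false

p_true, p_false = signal_probabilities(1.07)
print(f"Pr(true signal)  = {p_true:.3f}")
print(f"Pr(false signal) = {p_false:.3f}")
```

With h = 1.07 the results are close to the values tabulated for the h = 1.07 row of the 200-city example (0.825 and 0.143); the small difference in the first is consistent with the tabulated threshold being rounded.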

It’s All About Choosing Thresholds

• For each sensor, choice of $h$ is compromise between probability of true and false signals


[Figure: ROC curve, Pr(signal | attack) plotted against Pr(signal | no attack), with both axes running from 0.0 to 1.0.]
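
A curve of this kind can be traced by sweeping the threshold. The sketch below assumes $F_0 = N(0,1)$ and $F_1 = N(\gamma,1)$ with $\gamma = 2$ as a default; the function name `roc_points` and the threshold grid are illustrative choices, not taken from the slides.

```python
from statistics import NormalDist

def roc_points(gamma=2.0, thresholds=None):
    """Return (Pr(signal | no attack), Pr(signal | attack)) pairs, one per threshold."""
    f0, f1 = NormalDist(0, 1), NormalDist(gamma, 1)
    if thresholds is None:
        thresholds = [x / 10 for x in range(-30, 61)]  # h from -3.0 to 6.0
    return [(1 - f0.cdf(h), 1 - f1.cdf(h)) for h in thresholds]

for false_rate, true_rate in roc_points(thresholds=[0.0, 1.0, 2.0, 3.0]):
    print(f"Pr(signal | no attack) = {false_rate:.3f}, "
          f"Pr(signal | attack) = {true_rate:.3f}")
```

Raising the threshold moves down and to the left along the curve: fewer false signals, but also fewer true ones.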

Mathematical Formulation of the Problem

• It’s simple to write out:

$$\Pr(\text{detection}) = \sum_i \Pr(\text{signal} \mid \text{attack})\,\Pr(\text{attack})$$

$$E(\#\text{ false signals}) = \sum_i \Pr(\text{signal} \mid \text{no attack})$$

• Express it as an NLP optimization problem:

$$\max_{h} \sum_i \left[1 - F_1(h_i)\right] p_i \quad \text{s.t.} \quad \sum_i \left[1 - F_0(h_i)\right] \le \kappa$$

Some Assumptions

• Sensors are spatially independent

• Monitoring standardized residuals from an “adaptive regression” model

- Model accounts for (and removes) systematic effects in the data

- Result: Reasonable to assume $F_0 = N(0,1)$

• An attack will result in a 2-sigma increase in the mean of the residuals

- Result: $F_1 = N(2,1)$

• Then, NLP is:

$$\min_{h} \sum_i \Phi(h_i - 2)\, p_i \quad \text{s.t.} \quad \sum_i \Phi(h_i) \ge n - \kappa$$

where $\Phi$ is the standard normal CDF and $n$ is the number of sensors.
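
The step from the general formulation to this normal form can be spelled out. The following is a short worked sketch, assuming only $F_0 = N(0,1)$, $F_1 = N(2,1)$, $n$ sensors, and $\Phi$ denoting the standard normal CDF:

```latex
\begin{align*}
1 - F_1(h_i) &= 1 - \Phi(h_i - 2), \\
1 - F_0(h_i) &= 1 - \Phi(h_i), \\
\max_h \sum_i \left[1 - \Phi(h_i - 2)\right] p_i
  &= \sum_i p_i \;-\; \min_h \sum_i \Phi(h_i - 2)\, p_i, \\
\sum_i \left[1 - \Phi(h_i)\right] \le \kappa
  &\iff \sum_i \Phi(h_i) \ge n - \kappa .
\end{align*}
```

Because $\sum_i p_i$ does not depend on the thresholds, maximizing the probability of detection is the same as minimizing $\sum_i \Phi(h_i - 2)\,p_i$.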

Ten Sensor Example

Sensor i             p_i     Common Threshold #1   Optimal Threshold (h_i)   Common Threshold #2
1                    0.797   2.189                 1.068                     1.310
2                    0.004   2.189                 3.602                     1.310
3                    0.056   2.189                 3.732                     1.310
4                    0.048   2.189                 3.915                     1.310
5                    0.013   2.189                 4.656                     1.310
6                    0.000   2.189                 4.736                     1.310
7                    0.000   2.189                 4.736                     1.310
8                    0.005   2.189                 4.755                     1.310
9                    0.003   2.189                 4.773                     1.310
10                   0.002   2.189                 4.791                     1.310
Pd                           0.117                 0.378                     0.378
E(# false signals)           0.143                 0.143                     0.951
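
The summary values for the two common thresholds can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: $F_0 = N(0,1)$, $F_1 = N(\gamma,1)$, ten sensors, and attack probabilities that sum to 1. The shift $\gamma = 1$ used below is an inference rather than something the slide states; it is the value under which the tabulated detection probabilities are reproduced. Function names are illustrative.

```python
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal CDF, i.e. F_0 under the stated assumption
N_SENSORS = 10

def expected_false_signals(h, n=N_SENSORS):
    """Expected false signals per period when all n sensors share threshold h."""
    return n * (1 - PHI(h))

def detection_probability(h, gamma):
    """Pr(detection) for a common threshold h, assuming the p_i sum to 1."""
    return 1 - PHI(h - gamma)

for label, h in [("Common Threshold #1", 2.189), ("Common Threshold #2", 1.310)]:
    print(f"{label}: E(# false signals) = {expected_false_signals(h):.3f}, "
          f"Pd = {detection_probability(h, gamma=1.0):.3f}")
```

Under these assumptions the first common threshold matches the optimal solution's false-signal budget but detects far less often, while the second matches its detection probability at a much higher false-signal cost.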

Simplifying to a One-dimensional Optimization Problem

• System of $n$ hospitals (sensors) means optimization has $n$ free parameters

- Hard to solve for large systems

• Can simplify to one-parameter problem:

- Theorem: For $F_0 = N(0,1)$ and $F_1 = N(\gamma,1)$, the optimization simplifies to finding $\mu$ to satisfy

$$\sum_{i=1}^{n} \Phi\!\left(\mu - \frac{1}{\gamma}\ln(p_i)\right) = n - \kappa,$$

and the optimal thresholds are then

$$h_i = \mu - \frac{1}{\gamma}\ln(p_i).$$
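
Because the left-hand side of the theorem's equation is increasing in $\mu$, the one-parameter problem can be solved with a simple root search. The sketch below uses plain bisection, which is a choice made here for illustration and not necessarily the authors' method; the function names and the four-sensor probabilities are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # F_0 = N(0, 1)

def optimal_thresholds(p, gamma, kappa, tol=1e-10):
    """Find mu with sum_i Phi(mu - ln(p_i)/gamma) = n - kappa; return (mu, thresholds)."""
    n = len(p)
    if not 0 < kappa < n:
        raise ValueError("kappa must lie strictly between 0 and n")
    if any(p_i <= 0 for p_i in p):
        raise ValueError("every p_i must be positive")
    offsets = [-math.log(p_i) / gamma for p_i in p]

    def excess(mu):
        # Increasing in mu; zero exactly when the false-signal budget is met.
        return sum(STD_NORMAL.cdf(mu + off) for off in offsets) - (n - kappa)

    lo, hi = -1.0, 1.0
    while excess(lo) > 0:   # widen the bracket downward if needed
        lo *= 2
    while excess(hi) < 0:   # widen the bracket upward if needed
        hi *= 2
    while hi - lo > tol:    # bisection
        mid = (lo + hi) / 2
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    mu = (lo + hi) / 2
    return mu, [mu + off for off in offsets]

def detection_probability(h, p, gamma):
    """Sum over sensors of p_i * [1 - Phi(h_i - gamma)]."""
    return sum(p_i * (1 - STD_NORMAL.cdf(h_i - gamma)) for h_i, p_i in zip(h, p))

def expected_false_signals(h):
    """Sum over sensors of 1 - Phi(h_i)."""
    return sum(1 - STD_NORMAL.cdf(h_i) for h_i in h)

p = [0.5, 0.3, 0.15, 0.05]  # hypothetical attack probabilities
mu, h = optimal_thresholds(p, gamma=2.0, kappa=0.4)
print("mu =", round(mu, 3))
print("thresholds =", [round(h_i, 3) for h_i in h])
print("Pr(detection) =", round(detection_probability(h, p, 2.0), 3))
print("E(# false signals) =", round(expected_false_signals(h), 3))
```

Sensors with a higher probability of attack receive lower thresholds, which is the pattern visible in the tabulated examples.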

Consider (Hypothetical) System to Monitor 200 Largest Cities in US

• Assume probability of attack is proportional to the population in a city: $p_i = m_i / M$, with $M = \sum_j m_j$

Optimal Solution for 200 Cities

• Assume

- 2σ magnitude event

- Constraint of 1 false signal system-wide / day

i    City            State          Population m_i   Pr(attack) p_i = m_i/M   Threshold h_i   Pr(signal | attack)   Pr(signal | no attack)
1    New York city   New York       8,214,426        0.1101                   1.07            0.825                 0.143
2    Los Angeles     California     3,849,378        0.0516                   1.45            0.710                 0.074
3    Chicago         Illinois       2,833,321        0.0380                   1.60            0.656                 0.055
4    Houston         Texas          2,144,491        0.0287                   1.74            0.603                 0.041
5    Phoenix         Arizona        1,512,986        0.0203                   1.91            0.535                 0.028
6    Philadelphia    Pennsylvania   1,448,394        0.0194                   1.93            0.526                 0.027
7    San Antonio     Texas          1,296,682        0.0174                   1.99            0.504                 0.023
8    San Diego       California     1,256,951        0.0168                   2.01            0.498                 0.022
9    Dallas          Texas          1,232,940        0.0165                   2.01            0.494                 0.022
10   San Jose        California     929,936          0.0125                   2.16            0.438                 0.016

• Result: Pr(signal | attack) = 0.388

• Naive result: Pr(signal | attack) = 0.283
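
One plausible reading of the naive result is a single common threshold chosen so that the 200 sensors together produce one expected false signal per day. The sketch below makes that assumption, along with $F_0 = N(0,1)$, $F_1 = N(2,1)$, and attack probabilities that sum to 1; the function name is illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def naive_detection_probability(n, kappa, gamma):
    """Common threshold and Pr(detection) when n sensors split kappa false signals evenly."""
    h_common = STD_NORMAL.inv_cdf(1 - kappa / n)    # each sensor: false-signal rate kappa / n
    pd = 1 - STD_NORMAL.cdf(h_common - gamma)       # same for every sensor, and sum of p_i is 1
    return h_common, pd

h_common, pd_naive = naive_detection_probability(n=200, kappa=1, gamma=2)
print(f"common threshold = {h_common:.3f}, Pr(detection) = {pd_naive:.3f}")
```

Under these assumptions the common threshold is about 2.58 and the detection probability is close to the 0.283 quoted for the naive case.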

Pd - False Alarm Trade-Off

Choosing γ and κ

• Optimal probability of detection for various choices of γ and κ

Pd      κ = 1   κ = 2   κ = 3   κ = 4   κ = 5
γ = 1   0.165   0.228   0.272   0.307   0.336
γ = 2   0.388   0.481   0.540   0.583   0.618
γ = 3   0.726   0.801   0.840   0.866   0.885
γ = 4   0.939   0.964   0.974   0.980   0.984

- Choice of κ depends on available resources

- Setting γ is subjective: what size mean increase is important to detect?

WWW.NPS.EDU 


NAVAL 

POSTGRADUATE 
SCHOOL 


Sensitivity Analyses

Optimal probability of detection

Pd      κ = 1   κ = 2   κ = 3   κ = 4   κ = 5
γ = 1   0.165   0.228   0.272   0.307   0.336
γ = 2   0.388   0.481   0.540   0.583   0.618
γ = 3   0.726   0.801   0.840   0.866   0.885
γ = 4   0.939   0.964   0.974   0.980   0.984

Actual probability of detection

Pd               κ = 1   κ = 2   κ = 3   κ = 4   κ = 5
Observed γ = 1   0.137   0.193   0.235   0.269   0.298
Observed γ = 2   0.388   0.481   0.540   0.583   0.618
Observed γ = 3   0.711   0.790   0.832   0.859   0.879
Observed γ = 4   0.925   0.955   0.968   0.976   0.981

Optimizing a County-level System

[Figure: county-level US maps with panels for probability of attack and thresholds. Legend bins for probability of attack: 0.0000-0.0004, 0.0005-0.0011, 0.0012-0.0019, 0.0020-0.0028, 0.0029-0.0046, 0.0047-0.0084, 0.0085-0.0177, 0.0178-0.0332.]

Thresholds as a Function of Probability of Attack

[Figure: threshold plotted against probability of attack ($p_i$).]

Take-Aways

• BioSense and other biosurveillance systems’ performance can be improved now at no cost

• Approach allows for customization

- E.g., increase in probability of detection at individual location or add additional constraint to minimize false signals

• Applies to other sensor system applications:

- Port surveillance, radiation/chem detection systems, etc.

• Details in Fricker and Banschbach (2007)

Future Research Directions

• Assess data fusion techniques for use when multiple sensors in each region

- I.e., relax sensor (spatial) independence assumption

• Generalize from threshold detection methods to other methods that use historical information

- I.e., relax temporal independence assumption